Larsson, B. The stability of results: Some examples of the effects of
scale transformations. Didakometry (Malmö, Sweden: School of
Education), No. 42, 1974.

When the admissible class of transformations for a scale, defined to
measure a certain concept, is broader than the class of transformations
for which a given index of result is invariant, the question of the
stability of results arises. In such situations one may be interested
in finding the range of the index or perhaps that transformation which
maximizes or minimizes the index. The technique used here to
obtain these objects is to express a variable with many categories
as a weighted sum of its binary variables, the weights being the scale
values.

This report gives some simple examples of stability for one factor
and 2 x 2 factorial analysis of variance, reliability and correlations.
The findings are very different: from superstability (no transformation
whatsoever can change the result) to almost total instability. This is
followed by a discussion of applications to multivariate analysis, and
by some final remarks. It can be added that the technique can also
be utilized for scaling variables to obtain a best fit to mathematical
models other than those involved in usual statistical analysis.
INTRODUCTION

Scales used in educational research are, as a rule, loosely defined.
The most common numerical coding of the possible outcomes of a
measurement is successive integers. However, there is seldom anything
in the educational measurement procedure which prescribes this rather
than any other coding. Educational researchers will in most cases not
have any fundamental objection to exchanging this coding for a monotonic
transformation of it.

On the other hand, many statistical methods (or other mathematical
models used in educational research) are only invariant e.g. up to linear
transformations. The question is: how stable are results, described by these
methods, when monotonic transformations constitute the class of acceptable
codings? High stability admits conclusions with great generality. It may
also be of interest to choose that scale which, under given restrictions,
maximizes (or minimizes) a certain index of result.

The techniques used for investigating the stability are based on a general
principle. By using binary coding, each many-valued variable can be
expressed as a weighted sum of its binary variables, where the weights
are the scale values. This implies that almost all analysis will be
multivariate, e.g. a certain type of analysis of variance (ANOVA) is
transferred to the corresponding discriminant analysis, modified due to
some restriction of transformations.

This report gives some simple examples of the stability of results for
some statistical methods, viz. one factor and 2 x 2 factorial ANOVA with
different cell samples, reliability estimates from a one factor ANOVA with
repeated measures design, and product-moment correlations. The report
also discusses, though without examples, some possible extensions to
multivariate methods. With one exception, the examples only treat three
or four-valued variables, which makes it possible to visualize the results
on graphs. The data are, again with one exception, artificial, constructed
to constitute a first test of some optimization routines.

It can be added that the binary coding technique is not limited to
statistical methods. We can use it to code variables, under given restrictions,
to obtain a best fit to a certain mathematical model (described by a chosen
goodness-of-fit criterion). If this optimal fit is bad, the conclusion that the
model is unsuitable will be quite general. For instance, we may code a
variable to obtain a certain distribution function, code two variables to obtain
a given linear relation, or code a learning variable to be a specified function
of the number of trials.

METHOD 

In this section we will first describe the binary coding technique and
relations between and within many-valued variables and binary variables.
Some comments are then made on the general forms of the indices of result
used in the examples, followed by a short discussion of the concept of
stability and some simulations.

Binary coding

The idea of binary coding is not new. Some information about it is given
in Larsson (1973), and Bradley et al. (1962) use it in a modified form.
The description will here be sufficiently general to cover also most of the
discussion concerning multivariate analysis.

Let $x_i$, $i = 1, \ldots, p$, be a many-valued variable with $k_i + 1$
categories. There are $n_{ig}$ measurement objects characterized by
category $g$, which has the numerical code $a_{ig}$; $g = 0, 1, \ldots, k_i$.
The categories are often ordered and in such cases $g$ indicates the order.
In the sequel we only regard a certain standardized coding having
$a_{i0} = 0$ and $a_{ik_i} = 1$.

The binary variable $u_{ig}$ is now defined as

$$u_{ig} = \begin{cases} 1 & \text{if } x_i = a_{ig} \\ 0 & \text{if } x_i \neq a_{ig} \end{cases} \qquad g = 1, \ldots, k_i. \tag{1}$$

The vector of arithmetic means of the binary variables is
$m_i = (n_{ig}/n)$ and the covariance matrix is
$S_{ii} = \big((\delta_{gh} n_{ig} - n_{ig} n_{ih}/n)/n\big)$,
where $\delta_{gh}$ is Kronecker's $\delta$ and $n = \sum_{g=0}^{k_i} n_{ig}$.
(We assume that $n$ has the same value, independent of $i$.) Likewise, the
covariance matrix between binary variables, corresponding to two
x-variables, becomes
$S_{ij} = \big((n_{g(i)h(j)} - n_{ig} n_{jh}/n)/n\big)$.
Here $h$ and $j$ are alternative indices of $g$ and $i$, respectively, and
$n_{g(i)h(j)}$ is the number of objects which simultaneously belong to
category $g$ of $x_i$ and category $h$ of $x_j$.
As a parenthesis, we may mention that the nonnumerical information
of $x_i$, e.g. that contained in $S_{ii}$, can be used by analogy with
multivariate statistics. The determinant of a covariance matrix is there
one index of 'generalized variance', and
$|S_{ii}| = \prod_{g=0}^{k_i} (n_{ig}/n)$ may be used as a measure
of the nonnumerical 'variance' of $x_i$. It is related to information
theoretical measures of uncertainty, see e.g. Fhaner (1966).

For cases dealt with here, binary coding may be said to split up the
information of $x_i$ into a nonnumerical part, $u_i = (u_{ig})$, and a
numerical part, $a_i = (a_{ig})$. We obtain the fundamental formula

$$x_i = a_i' u_i. \tag{2}$$

The arithmetic mean of $x_i$ becomes $a_i' m_i$ and its variance
$a_i' S_{ii} a_i$, while the covariance of $x_i$ and $x_j$ can be written
as $a_i' S_{ij} a_j$.
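
The relations above can be checked numerically. The following is a minimal sketch in plain Python, not taken from the report: the category frequencies and the helper names `binary_moments` and `quad` are hypothetical, chosen only to show that the mean and variance of a coded variable are reproduced by $a'm$ and $a'Sa$.

```python
def binary_moments(counts):
    """Return (m, S) for the binary variables u_1..u_k of one variable.

    counts[g] is the number of objects in category g = 0, 1, ..., k.
    Category 0 has code 0 and therefore needs no binary variable.
    """
    n = sum(counts)
    k = len(counts) - 1
    m = [counts[g] / n for g in range(1, k + 1)]
    S = [[((counts[g] if g == h else 0) - counts[g] * counts[h] / n) / n
          for h in range(1, k + 1)]
         for g in range(1, k + 1)]
    return m, S


def quad(a, M, b):
    """Bilinear form a'Mb for plain Python lists."""
    return sum(a[g] * M[g][h] * b[h]
               for g in range(len(a)) for h in range(len(b)))


counts = [30, 50, 20]      # hypothetical category frequencies (assumption)
codes = [0.0, 0.5, 1.0]    # standardized coding: first code 0, last code 1
a = codes[1:]              # scale vector without the fixed code of category 0

m, S = binary_moments(counts)
mean_binary = sum(x * y for x, y in zip(a, m))   # a'm
var_binary = quad(a, S, a)                       # a'Sa

# Direct computation from the coded observations, for comparison.
n = sum(counts)
mean_direct = sum(c * v for c, v in zip(counts, codes)) / n
var_direct = sum(c * (v - mean_direct) ** 2
                 for c, v in zip(counts, codes)) / n

print(mean_binary, mean_direct)
print(var_binary, var_direct)
```

Both routes give the same mean and variance, so changing the scale only means changing `a`; the nonnumerical part (`m` and `S`) is computed once.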

Let us now consider all x-variables simultaneously and define
$x = (x_i)$, $m_x = (a_i' m_i)$, $u = (u_i)$, $m_u = (m_i)$ and
$S_{uu} = (S_{ij})$. We also need $D$, a block diagonal matrix having
$a_i$ on the principal diagonal and thus of order $K \times p$, where
$K = \sum k_i$. Hence

$$x = D' u. \tag{3}$$

It follows from formula 3 that $m_x = D' m_u$ and the covariance matrix
of $x$ will be $S_{xx} = D' S_{uu} D$.

In multivariate statistical analysis it is rather common to define
new variables as a weighted sum of other variables, e.g. $z = c'x$.
We may take $t = Dc$, meaning that $z = t'u$ with $m_z = t' m_u$ and
$s_z^2 = t' S_{uu} t$. Thus, the situation is the same as for one
x-variable. (But see next section about formulations of restriction for
monotonic transformations.)

Indices of result

Almost all indices of result presented in this report have the following
form for one dependent variable x (we now skip i and j):

$$Q = \frac{a'Fa}{a'Ga}. \tag{4}$$

Both matrices are real and symmetric, and they can be weighted sums
of other, more basic matrices. For all Q here, G will be positive
definite. There are, however, cases where G may be positive semidefinite,
e.g. if Q is an F ratio for a random factor. In many cases F will be
positive semidefinite (or definite) but we will also meet exceptions
from this (F is indefinite). I think that exceptions are rather common in
connection with variance components, where negative values are possible.
In some applications there may be other properties, e.g. each diagonal
element of F cannot exceed the corresponding element of G.

For standardized a, but no other restrictions, we can seek an
optimal scale in the whole (k - 1)-dimensional real space of a and thus
have an eigenvalue problem as e.g. for common discriminant analysis.
The only restriction taken up here is that of monotonic transformations.
In most cases this will mean $0 < a_1 < a_2 < \cdots < a_{k-1} < 1$. The
admissible a space is then a peculiarly cut 'piece of cheese' in the
principal quadrant. Under certain circumstances, however, monotonic
transformations can only involve blockwise ranking, for instance
$0 < (a_1, a_2) < a_3 < 1$, with no ranking within blocks. Such a case will
appear in this report. Also, for many x-variables the monotonic restriction
implies that the t vector of the last section will only be ranked within
blocks ($c_i a_i$) but not between blocks.

I believe that it is not unusual for the optimal a to lie on the boundary
of the admissible space. In particular, corner solutions seem to be
'favoured' for min Q, as far as my brief experience hitherto shows, that
is, x is dichotomized. Some support for this belief concerning max Q is
given by Bradley et al. (1962). They seem to have analysed rather a lot
of data (one factor ANOVA) and often found boundary solutions, at least
when k is large.

The index Q according to formula 4 is not relevant for one of the
examples concerning a product-moment correlation. The problem is then
simultaneously to code two x-variables and Q will have the general form

$$Q = \frac{(a_i' F_{ij} a_j)^2}{a_i' G_{ii} a_i \; a_j' H_{jj} a_j}. \tag{5}$$

For the (squared) correlation, the matrices are different covariance
matrices (between and within $u_i$ and $u_j$).
Stability

For a certain Q index and given restrictions, data are more stable,
or more insensitive to admissible transformations, the smaller the
difference between the maximum and minimum of Q. We will not use
any special stability index in this report but it can be needed for
certain comparisons. There are Q indices, the range of which varies
(e.g. as a function of n), and different Q indices may have quite different
ranges. For indices with finite ranges it is reasonable to relate the
actual range (max Q - min Q for certain data) to the maximally possible
range (without restrictions), e.g. define stability as 1 - (max Q - min Q
for certain data)/(maximal range).
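
As a small worked illustration of this definition, the sketch below evaluates it in Python. The function name `stability` is hypothetical. The two extreme values are the restricted maximum and minimum reported for example 1 further on, and the maximal range of 1 is an assumption that holds because that index is a proportion of the total sum of squares.

```python
def stability(q_max, q_min, maximal_range):
    """1 - (max Q - min Q) / (maximal range), as suggested in the text."""
    if maximal_range <= 0:
        raise ValueError("maximal_range must be positive")
    return 1.0 - (q_max - q_min) / maximal_range


# Restricted extremes quoted for example 1; Q is a proportion, so the
# maximal possible range is taken to be 1 (assumption).
s_example_1 = stability(0.9079, 0.1146, 1.0)
print(round(s_example_1, 4))
```

A value near 0 signals almost total instability and a value of 1 signals superstability.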

Total instability is obtained for data which have maximal Q range.
Some examples have data which are almost in this state. The opposite
will be coined superstability, which means that no transformation,
monotonic or not, can change Q. For formula 4 this implies that
F is proportional to G. We will give two examples of this remarkable
property. It is finally obvious, for a definition of stability as of the last
paragraph, that for two different restrictions, described by $a \in R_1$ and
$a \in R_2$ with $R_1 \subset R_2$, the stability cannot be greater for
$R_2$ than for $R_1$.

Simulations 

Two types of simulations will be commented upon here, but only the
first type has yet been performed. The type I subroutine produces
rectangularly distributed random scale values which are ranked and
exploited for the calculation of Q according to formula 4 (F and G are
fixed and supplied by the main program). The generation of a is repeated
an arbitrary number of times, thus giving a whole distribution of Q.
It is of special interest to know the relative position of Q for equally
spaced scale values. (You may here speak about a kind of inference, with
the generated distribution as a sample distribution over scales.) The
type I runs will also serve as a check of the optimization routines: if
the simulated distribution contains more extreme values than those from
the optimization routines, an error is indicated.
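
A type I run of this kind can be sketched as follows. This is only an illustration in plain Python, not the report's subroutine: the 2 x 4 frequency table, the seed and the helper names `between_total`, `q_index` and `type_one` are hypothetical, and F and G are taken to be the between-groups and total cross product matrices of a one factor design.

```python
import random


def between_total(table):
    """Cross product matrices (B, T) of the binary variables.

    table[j][g] is the number of objects of group j in category g = 0..k.
    """
    k = len(table[0]) - 1
    n_j = [sum(row) for row in table]
    n_g = [sum(row[g] for row in table) for g in range(k + 1)]
    n = sum(n_j)
    B = [[sum(row[g] * row[h] / nj for row, nj in zip(table, n_j))
          - n_g[g] * n_g[h] / n
          for h in range(1, k + 1)] for g in range(1, k + 1)]
    T = [[(n_g[g] if g == h else 0) - n_g[g] * n_g[h] / n
          for h in range(1, k + 1)] for g in range(1, k + 1)]
    return B, T


def q_index(a, F, G):
    """Q = a'Fa / a'Ga (formula 4)."""
    idx = range(len(a))
    num = sum(a[g] * F[g][h] * a[h] for g in idx for h in idx)
    den = sum(a[g] * G[g][h] * a[h] for g in idx for h in idx)
    return num / den


def type_one(F, G, runs, rng):
    """Draw ranked uniform scale values and return the resulting Q values."""
    k = len(F)
    values = []
    for _ in range(runs):
        inner = sorted(rng.random() for _ in range(k - 1))  # ranked free values
        values.append(q_index(inner + [1.0], F, G))          # highest code is 1
    return values


table = [[20, 15, 10, 5], [5, 10, 15, 20]]   # hypothetical data (assumption)
B, T = between_total(table)
rng = random.Random(1974)                    # fixed seed for reproducibility
simulated = type_one(B, T, 2000, rng)

k = len(B)
q_equal = q_index([g / k for g in range(1, k + 1)], B, T)
position = sum(q < q_equal for q in simulated) / len(simulated)
print(min(simulated), q_equal, max(simulated), position)
```

The value `position` is the share of simulated scales that give a lower Q than equally spaced scale values, which is one way of expressing the relative position mentioned above.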

The purpose of type II simulations is to get a comprehension of the
variation of the extreme values with repeated samples of measurement
objects from the same population. We will construct some convenient
populations, take a number of samples and apply the optimization
subroutines. In this way we get an estimate of the common kind of
sample distribution of min Q and max Q, which is, no doubt, important.
However, this type of analysis seems to be rather expensive and cannot
always be made. I assume that some priority must be made: it may be
necessary to elucidate this problem by only running type II simulations
for the most common Q indices.

SOME EXAMPLES 

Most of the following examples are illustrated both with tables of
basic data and with graphs of Q as a function of a. While the tables
are presented on successive text pages, the graphs are collected in an
appendix. The matrices F, G and H are not shown but they are easily
retrieved from the appendix, where the functions are also given.

Two simple ANOVA designs

Factorial ANOVA with different cell samples has been studied by some
authors (not all referred to here but see Meredith, Fredriksen &
McLaughlin, 1974, for some further references) aiming at finding scales
which optimize a certain effect. Tukey (1950) is one of the first to
solve, at least partially, this problem. He maximizes the F ratio but
his method does not guarantee rank invariance. Box & Cox (1964) give
this problem a more complete solution, but they restrict themselves to
certain families of functions. In that respect the method described by
Kruskal (1965) and Kruskal & Carmone (1969) is more general: it
considers all functions within the class of monotonic transformations.
This is also the case with the method proposed by Bradley et al. (1962)
and in this report.

For a univariate one factor ANOVA with different samples, the total
sum of squares is divided up into the sum of squares between groups
(samples) and within groups. The corresponding cross product matrices
in the multivariate case will be denoted T, B and W, respectively. We
generate these matrices, of order k x k, by binary coding of a dependent
variable with k + 1 categories. The Q index used here, for given scale
values a, will be the ratio of the sum of squares between groups to the
total sum of squares. In accordance with formula 4 this implies that
F = B and G = T.

The first example is taken from Larsson (1973). As is clear from
table 1, the factor has three levels and the dependent variable three
categories. (The numbering of the latter only indicates order.) Figure 1
of the appendix shows Q as a function of a.

Table 1. Basic data of example 1

Category    Level 1   Level 2   Level 3     Sum
   3            0        10        10        20
   2            1        29        20        50
   1           29         1         0        30
  Sum          30        40        30       100

The two eigenvalues become 0.9079 and 0.0069, of which the largest
one happens to be generated by an admissible scale under the restriction
of monotonic transformations. The minimum of Q with this restriction
is 0.1146. It can be added that the scale (0.0, 0.5, 1.0) gives a Q value of
0.6610. We thus have a very unstable situation, where different monotonic
transformations may generate quite different descriptions: the proportion
of the total variance explained by group differences may differ by as much
as 79 percentage points. Notice also that the dichotomized scale
(0.0, 1.0, 1.0) is more sensitive to group discrimination than
(0.0, 0.5, 1.0). I believe that this can be a rather general finding:
more scale values do not guarantee higher Q values.
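
The figures quoted for example 1 can be retraced with a few lines of plain Python. This is a sketch under stated assumptions rather than the report's own routine: the cell frequencies are my reading of table 1 (the scan is unclear in places), the helper names are hypothetical, the unrestricted extremes are obtained from the quadratic det(B - lambda T) = 0 for the 2 x 2 case, and the restricted extremes come from a simple grid over the single free scale value.

```python
import math


def between_total(table):
    """Cross product matrices (B, T); table[j][g], categories lowest first."""
    k = len(table[0]) - 1
    n_j = [sum(row) for row in table]
    n_g = [sum(row[g] for row in table) for g in range(k + 1)]
    n = sum(n_j)
    B = [[sum(row[g] * row[h] / nj for row, nj in zip(table, n_j))
          - n_g[g] * n_g[h] / n
          for h in range(1, k + 1)] for g in range(1, k + 1)]
    T = [[(n_g[g] if g == h else 0) - n_g[g] * n_g[h] / n
          for h in range(1, k + 1)] for g in range(1, k + 1)]
    return B, T


def q_index(a, F, G):
    """Q = a'Fa / a'Ga (formula 4)."""
    idx = range(len(a))
    num = sum(a[g] * F[g][h] * a[h] for g in idx for h in idx)
    den = sum(a[g] * G[g][h] * a[h] for g in idx for h in idx)
    return num / den


def eigenvalues_2x2(F, G):
    """Roots of det(F - lambda*G) = 0 for symmetric 2 x 2 matrices."""
    qa = G[0][0] * G[1][1] - G[0][1] ** 2
    qb = -(F[0][0] * G[1][1] + F[1][1] * G[0][0] - 2 * F[0][1] * G[0][1])
    qc = F[0][0] * F[1][1] - F[0][1] ** 2
    root = math.sqrt(qb * qb - 4 * qa * qc)
    return (-qb + root) / (2 * qa), (-qb - root) / (2 * qa)


# Rows are factor levels 1..3, columns are categories 1, 2, 3
# (frequencies as read from table 1; an assumption where the scan is unclear).
table1 = [[29, 1, 0], [1, 29, 10], [0, 20, 10]]
B, T = between_total(table1)

q_equal = q_index([0.5, 1.0], B, T)            # scale (0.0, 0.5, 1.0)
lam_max, lam_min = eigenvalues_2x2(B, T)       # unrestricted extremes
grid = [i / 1000 for i in range(1001)]         # monotonic scales (0, a, 1)
q_grid = [q_index([a1, 1.0], B, T) for a1 in grid]
q_min, q_max = min(q_grid), max(q_grid)

print(q_equal, lam_max, lam_min, q_min, q_max)
```

With these frequencies the sketch gives values that agree with the quoted 0.6610, 0.9079, 0.0069 and 0.1146 to about four decimals, and the restricted maximum lies just inside the admissible interval, close to the dichotomy (0.0, 1.0, 1.0).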

The basic data of the next example are shown in table 2. It consists
of two parts, each with two levels and a three-valued dependent variable.
Figure 2 of the appendix gives both curves (Q as a function of a).

Table 2. Basic data of example 2

                    Left part                      Right part
Category    Level 1   Level 2     Sum      Level 1   Level 2     Sum
   3           10        25        35          0        30        30
   2           20        10        30         40         0        40
   1           10        25        35          0        30        30
  Sum          40        60       100         40        60       100

For both parts, data are constructed so that the scale (0.0, 0.5, 1.0)
gives a Q value of 0.0000. The left part has eigenvalues of 0.1270 and
0.0000 and the maximal Q value for monotonic transformations is
0.0293. The right part, which involves extremely different distributions,
has eigenvalues of 1.0000 and 0.0000, while the restricted Q maximum
is 0.2857. Thus, a Q value of zero for equally spaced scale values can
be increased, though very dissimilar distributions seem to be needed for
a substantial change. If the distributions have exactly the same form,
Q will be superstable (will be zero independent of a).

2 x 2 factorial ANOVA

For this case the cross product matrix between cells will be partitioned
into three matrices: $B_A$ for the main effect of factor A, $B_B$ for the
main effect of factor B, and $B_{AB}$ for the interaction effect. We use the
same Q index as for one factor ANOVA, which means that the numerator matrix
of formula 4 is one of the B matrices, while G is still equal to T. When we
describe an effect by this Q value, it is evident that the effect can be
totally eliminated if the numerator matrix is positive semidefinite. This
property is normally obtained when the degrees of freedom of the effect
are less than k. However, it is far from certain that a monotonic
transformation gives Q = 0.

Two different examples will be given for the 2 x 2 factorial design
with independent cell samples. The basic data of the first one are presented
in table 3. Figure 3 of the appendix shows the curves of the effects,
including that between cells.

Table 3. Basic data of example 3

                  A1                  A2
Category      B1      B2          B1      B2        Sum
   3           5       0          10      15         30
   2          15      10           5      10         40
   1           5      15          10       0         30
  Sum         25      25          25      25        100


For equally spaced scale values we get $Q_A = Q_{AB} = 0.1500$ and
$Q_B = 0.0000$. The B effect is an instance of superstability, $Q_B$ is
constantly zero, and its curve in figure 3 is not apparent as it coincides
with the horizontal axis. The eigenvalues of A and AB are both 0.1917
and 0.0000, of which the highest one is associated with the admissible
a space for monotonic transformations. With this restriction the minimal
Q value is 0.0476 for both effects. (Notice from figure 3 that the A and
AB curves are reflections of each other around a = 0.5.) The sum effect
(between cells) has eigenvalues 0.3000 and 0.0833. Here again the global
maximum comes from the admissible a space but its minimum is 0.2381.
The Q value between cells is quite stable for monotonic transformations,
while that for A and AB is not quite so stable.
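
The effect proportions of this example can be retraced directly from the cell frequencies. The sketch below is an illustration in plain Python, not the report's procedure: it works with the coded scores rather than with the cross product matrices, it assumes equal cell sizes, the function name `effect_proportions` is hypothetical, and the frequencies are my reading of table 3.

```python
def effect_proportions(cells, scale):
    """Return (Q_A, Q_B, Q_AB) for a 2 x 2 design with equal cell sizes.

    cells[i][j] holds the category frequencies (lowest category first)
    of the cell with level i of factor A and level j of factor B.
    """
    size = sum(cells[0][0])
    mean = [[sum(c * s for c, s in zip(cells[i][j], scale)) / size
             for j in range(2)] for i in range(2)]
    grand = sum(mean[i][j] for i in range(2) for j in range(2)) / 4
    ss_total = sum(c * (s - grand) ** 2
                   for i in range(2) for j in range(2)
                   for c, s in zip(cells[i][j], scale))
    a_mean = [(mean[i][0] + mean[i][1]) / 2 for i in range(2)]
    b_mean = [(mean[0][j] + mean[1][j]) / 2 for j in range(2)]
    ss_a = 2 * size * sum((x - grand) ** 2 for x in a_mean)
    ss_b = 2 * size * sum((x - grand) ** 2 for x in b_mean)
    ss_ab = size * sum((mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                       for i in range(2) for j in range(2))
    return ss_a / ss_total, ss_b / ss_total, ss_ab / ss_total


# Frequencies as read from table 3 (assumption), categories 1, 2, 3.
table3 = [[[5, 15, 5], [15, 10, 0]],     # A1: B1, B2
          [[10, 5, 10], [0, 10, 15]]]    # A2: B1, B2

equal = effect_proportions(table3, [0.0, 0.5, 1.0])
corner = effect_proportions(table3, [0.0, 1.0, 1.0])
print(equal)
print(corner)
```

Under these assumptions the equally spaced scale gives 0.15 for A and AB and zero for B, and the dichotomy (0.0, 1.0, 1.0) gives roughly 0.0476 for A, which is the restricted minimum quoted above.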

We have said that, whenever the distributions of different groups are
identical, Q = 0 is a superstable result. However, superstability is not
confined to zero effects, as will be shown by the next example. The basic
data for this can be found in table 4, and figure 4 of the appendix illustrates
the functions.



Table 4. Basic data of example 4

                  A1                  A2
Category      B1      B2          B1      B2        Sum
   3          10      40          10      20         80
   2          40      10          10      20         80
   1          10      10          40      20         80
  Sum         60      60          60      60        240


For the usual scale (0.0, 0.5, 1.0) we obtain $Q_A = Q_B = 0.0938$ and
$Q_{AB} = 0.0000$. The eigenvalues are 0.1250 and 0.0000 for all three
effects. When restricting ourselves to monotonic transformations, the
restricted maxima and minima are 0.1250 and 0.0313 for A and B, while
those for AB are 0.0313 and 0.0000. However, the remarkable property
of this example is $Q_A + Q_B + Q_{AB} = 0.1875$, independent of a.
No transformation whatsoever can change the proportion of the total
variance due to differences of the cell means. I have no idea whether data
which, at least roughly, have this property are common or not. Notice
that the concept of superstability can be dependent on Q: it is not certain
that an index describes a result as superstable, in spite of the fact that
it has been so described by another index.
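
The superstability of the between-cells proportion can be illustrated by trying many scales, monotonic or not. The following plain-Python sketch is only a check under assumptions: the frequencies are my reading of table 4 (one cell is unclear in the scan and was inferred from the marginal sums), and the function name, seed and sampling interval are hypothetical.

```python
import random


def between_cells_proportion(cells, scale):
    """Sum of squares between cells divided by the total sum of squares."""
    n = sum(sum(c) for c in cells)
    grand = sum(cnt * s for c in cells for cnt, s in zip(c, scale)) / n
    ss_total = sum(cnt * (s - grand) ** 2
                   for c in cells for cnt, s in zip(c, scale))
    ss_cells = sum(
        sum(c) * (sum(cnt * s for cnt, s in zip(c, scale)) / sum(c) - grand) ** 2
        for c in cells)
    return ss_cells / ss_total


# Cells A1B1, A1B2, A2B1, A2B2; categories 1, 2, 3 (lowest first).
table4 = [[10, 40, 10], [10, 10, 40], [40, 10, 10], [20, 20, 20]]

rng = random.Random(4)
values = [between_cells_proportion(table4, [0.0, rng.uniform(-3.0, 3.0), 1.0])
          for _ in range(1000)]
print(min(values), max(values))
```

Every scale tried gives the same proportion, 0.1875, including scale values far outside the monotonic range.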

There are several conceivable indices suitable for describing ANOVA
results. Besides the proportion already used, we may mention the F ratio
and different combinations of variance components. As an example of an
alternative index, we take Q as the ratio of the estimated
variance component of factor A to the corresponding component of error,
and apply this index to the first 2 x 2 factorial example.
We assume that A and B are both fixed and estimate the components
by equating the observed mean squares with their expected values.
The index has the form shown by formula 4 with $F = (B_A - W/96)/50$
and $G = W/96$. Its lower limit is thus dependent on data (here -1/50),
while the upper limit may be set to infinity. The eigenvalues are, for
this example, 0.4786 and -0.0200. For monotonic transformations,
$0.1000 \le Q \le 0.4786$. As is seen from figure 5, the curve for this index
bears a close resemblance to the curve of figure 3. This may, however,
be a mere coincidence. For instance, if we take the same index but assume
the factors to be random, the resulting curve is rather different from
the $Q_A$ curve of figure 3.

Reliability

Determination of a weighted sum of variables with maximal reliability
is by no means a new problem. One of the older methods is presented e.g.
in Lord & Novick (1968, pp. 123-124) and another more general method
is described by Abelson (1960). These methods work with the same form
of Q (see formula 4) as the method proposed here, but the matrices are
not the same. Besides, my method can guarantee a solution within the
class of monotonic transformations and can be used for a single variable.
This is not the case with the other two methods.

We shall take an example which admits a comparison with Abelson's
method. The example comprises a 'test', composed of two binary items,
which is measured on ten persons on two occasions. The basic data are
given in table 5 and a Q function in figure 6.

Table 5. Basic data of example 5

                Item 1              Item 2
               Occasion            Occasion
Person        1       2           1       2
   1          1       1           1       1
   2          1       0           1       1
   3          1       1           1       1
   4          1       1           0       0
   5          1       1           0       0
   6          0       1           1       0
   7          0       0           1       0
   8          0       0           1       0
   9          0       0           0       1
  10          0       0           0       0

A "1" may be interpreted as a correct answer, a "0" as a wrong
answer. The test has four possible outcomes, (0, 0), (0, 1), (1, 0) and (1, 1),
with scale values 0, $a_1$, $a_2$ and 1, respectively. The restriction to
monotonic transformations will here imply $0 < (a_1, a_2) < 1$, since there
is no clear way of internally ranking the outcomes (0, 1) and (1, 0). Table 5
corresponds to a one factor ANOVA with repeated measures for a given
scale. In such a design the total sum of squares is split up into three
sums: for occasions O, persons P and interaction (plus error) OP, and
the same split is valid for the cross product matrices:
$T = B_O + B_P + B_{OP}$. The estimates of variance components relevant to
reliability give, in this case, $F = B_P - B_{OP}$ and $G = B_P + B_{OP}$.

The eigenvalues are 0.8383, 0.6762 and -0.464[?], none of which
corresponds to the admissible a space for monotonic transformations.
The common scale (0.0, 0.5, 0.5, 1.0) gives a reliability of 0.6327.
If we do not differentiate between the outcomes (0, 1) and (1, 0), this
value can still be improved on: the scale (0.0, 0.15, 0.15, 1.0) has
a reliability of 0.7747. (It happens to be the larger eigenvalue for the
Q function with $a_1 = a_2$.) According to Spearman-Brown's formula,
this is equal to a doubled test with the common scale.

Abelson's method distinguishes between (0, 1) and (1, 0) but the
solution is, for this example, confined to the line $a_1 + a_2 = 1$. It gives
the reliability value 0.6939, which corresponds to 1.32 times the length
of the commonly scaled test. (It seems to me that Abelson's method
coincides with mine when the outcome of the items is reproducible from
the sum score.) However, best of all monotonic transformations is
(0.0, 0.2, 0.0, 1.0), generating the value 0.8029 (2.37 times the
length of the common test). Finally, it can be said that the situation is
unstable: the minimum value is 0.0875 when (0, 1) and (1, 0) are separated
and 0.2162 without this separation.
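
The reliability index of this example can be sketched in plain Python as follows. Two things here are assumptions rather than facts from the report: the person-by-occasion entries are my reading of table 5, where cells that are unclear in the scan were filled in so that the three reliabilities quoted in the text are reproduced, and the names `DATA` and `reliability` are hypothetical. The index is computed from the sums of squares for persons and for the person by occasion interaction, which corresponds to F = B_P - B_OP and G = B_P + B_OP.

```python
# One tuple per person: (item 1 occ. 1, item 1 occ. 2, item 2 occ. 1, item 2 occ. 2).
# Entries follow my reading of table 5 (assumption where the scan is unclear).
DATA = [(1, 1, 1, 1), (1, 0, 1, 1), (1, 1, 1, 1), (1, 1, 0, 0), (1, 1, 0, 0),
        (0, 1, 1, 0), (0, 0, 1, 0), (0, 0, 1, 0), (0, 0, 0, 1), (0, 0, 0, 0)]


def reliability(scale):
    """(SS_P - SS_OP) / (SS_P + SS_OP) for a given scaling of the outcomes.

    scale maps an outcome (item 1, item 2) to its scale value.
    """
    scores = [(scale[(p[0], p[2])], scale[(p[1], p[3])]) for p in DATA]
    n_p, n_o = len(scores), 2
    cf = sum(sum(row) for row in scores) ** 2 / (n_p * n_o)   # correction term
    ss_total = sum(x * x for row in scores for x in row) - cf
    ss_p = sum(sum(row) ** 2 for row in scores) / n_o - cf
    ss_o = sum(sum(row[o] for row in scores) ** 2 for o in range(n_o)) / n_p - cf
    ss_op = ss_total - ss_p - ss_o
    return (ss_p - ss_op) / (ss_p + ss_op)


def make_scale(a1, a2):
    """Outcomes (0,0), (0,1), (1,0), (1,1) get the values 0, a1, a2, 1."""
    return {(0, 0): 0.0, (0, 1): a1, (1, 0): a2, (1, 1): 1.0}


r_common = reliability(make_scale(0.5, 0.5))
r_merged = reliability(make_scale(0.15, 0.15))
r_best = reliability(make_scale(0.2, 0.0))
print(r_common, r_merged, r_best)
```

With the data as read here, the three scales give values that agree with the quoted 0.6327, 0.7747 and 0.8029 to about four decimals.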

The example presented above can be generalized to more complex
univariate designs, e.g. those described by Cronbach et al. (1972). As
far as I can see this involves no mathematical novelties: it is only a
matter of correctly choosing the weighted sums of basic matrices
(from the ANOVA) which define F and G.



Correlations

The usual stability analysis of a squared bivariate correlation involves
a Q index according to formula 5. As we shall see, however, there are
occasions when a correlation takes the form hitherto discussed for Q.

The first example makes use of data from table 1, where we now
also assume the levels to be ordered (according to index numbers).
Figure 7 of the appendix shows the Q function. The eigenvalues are
the same as for example 1, 0.9079 and 0.0069, with the largest one
coming from the admissible a space for monotonic transformations.
There are nonmonotonic transformations for which Q becomes zero,
but for monotonic transformations the minimal Q value is 0.0476.
The scale (0.0, 0.5, 1.0) for both variables gives a value of 0.5173.
This example thus shows a very unstable situation.
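
A sketch of this two-variable case in plain Python is given below. It is an illustration under assumptions, not the report's routine: the frequencies are my reading of table 1, the function name `squared_correlation` is hypothetical, and the restricted minimum is approximated by a coarse grid over the two free scale values instead of an optimization routine.

```python
def squared_correlation(table, x_codes, y_codes):
    """Squared product-moment correlation for a coded two-way frequency table.

    table[j][g] is the frequency of level j (coded x_codes[j]) and
    category g (coded y_codes[g]).
    """
    n = sum(sum(row) for row in table)
    sx = sum(sum(row) * x for row, x in zip(table, x_codes))
    sy = sum(c * y for row in table for c, y in zip(row, y_codes))
    sxx = sum(sum(row) * x * x for row, x in zip(table, x_codes))
    syy = sum(c * y * y for row in table for c, y in zip(row, y_codes))
    sxy = sum(c * x * y
              for row, x in zip(table, x_codes)
              for c, y in zip(row, y_codes))
    cov = sxy - sx * sy / n
    return cov * cov / ((sxx - sx * sx / n) * (syy - sy * sy / n))


# Rows are levels 1..3, columns are categories 1, 2, 3
# (frequencies as read from table 1; an assumption where the scan is unclear).
table1 = [[29, 1, 0], [1, 29, 10], [0, 20, 10]]

q_equal = squared_correlation(table1, [0.0, 0.5, 1.0], [0.0, 0.5, 1.0])

steps = [i / 20 for i in range(21)]            # grid over monotonic scales
q_grid = [squared_correlation(table1, [0.0, b, 1.0], [0.0, a, 1.0])
          for a in steps for b in steps]
q_min, q_max = min(q_grid), max(q_grid)
print(q_equal, q_min, q_max)
```

Under these assumptions the equally spaced scales give about 0.5173, and the smallest grid value, about 0.0476, is found at the corner where both variables are dichotomized between their two highest categories.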

Figure 7 also gives some relations between this example and example 1.
The curve denoted $P_1$ is connected to figure 1: the curve of figure 1
shows the height when following $P_1$ of figure 7. In the same manner
we get $P_2$, the corresponding curve to $P_1$ when independent and dependent
variables change places in example 1. (Only parts of $P_1$ and $P_2$ are
shown in figure 7.)

Suppose that it is desirable to determine the same scale for a number
of variables with equally many categories and to define an average
correlation as the ratio of the average covariance to the average
variance. (The reliability estimate of example 5 is such a correlation.)
We then have a correlation analysis where the form of Q is given by
formula 4. No example of such an analysis is shown here, but we will
instead present data of another correlation problem conformable to
formula 4.

This example comprises 'real' data from a pilot study (n = 44).
The correlation problem concerns the relation between frequency
statements (the number of days per year) and verbal statements for
six different questions. The verbally anchored variables have categories
labelled almost never, seldom, sometimes, often and almost always.
The six questions asked refer to how often you 1. watch TV, 2. go to
the pictures, 3. wake up rested, 4. have a headache, 5. are stressed
and 6. feel expectant. Some correlations between frequency statements
and verbal statements are given in table 6 for each question.
Table 6. Some correlations of example 7

Question     min Q       common Q       max Q
   1         0.2921     [illegible]   [illegible]
   2         0.1464       0.4570      [illegible]
   3         0.1212       0.7194        0.7447
   4         0.0659       0.6065        0.6268
   5         0.0950       0.4007        0.4212
   6         0.0729       0.5838        0.6891


It is reasonable, for this example, not to recode the frequency
variables: we regard 'the number of days per year' as a fixed scale
and are only interested in numerically coding the verbal categories
to obtain minimal and maximal squared correlations between frequency
statements and verbal statements. This problem gives a Q index
according to formula 4, with $F = ss'/s^2$ and $G = S$. Here $s$ is the
vector of covariances between the binary variables and the frequency
variable, which has variance $s^2$, and $S$ is the covariance matrix of the
binary variables.

As F is positive semidefinite, it is possible to obtain zero correlations
but they do not correspond to scales within the a space of monotonic
transformations. The minimal Q values for this space are all generated
by corner solutions, that is, the worst admissible dichotomizations.
The restricted maximal Q value coincides with the greatest eigenvalue,
except in questions 2 and 3, which give the only boundary solutions, but
their maximal Q values are almost the same as their nonzero
eigenvalues. The second column of table 6 refers to Q when the verbal scale
has equally spaced values. We see that these common Q values are of the
same magnitudes as the corresponding maximal values, perhaps with
the exception of question 6. On the other hand, the ability to predict
frequency statements from verbal statements is in no case very high.

EXTENSIONS

This section contains some rather loose ideas about possible applications
of stability analysis to multivariate statistical methods. I do not know if
there are new numerical problems not encountered in univariate analysis.
Of course, the multitude of values to determine may in itself raise
difficulties. I will now comment superficially upon principal component
analysis, discriminant analysis, canonical correlation analysis and
factor analysis.

We discuss principal component analysis only by treating the problem
of finding a weighted variable $z = c'x = c'D'u = t'u$ with maximal variance.
For instance, for two variables with three categories each, we have
$t' = (c_1 a_{11}, c_1, c_2 a_{21}, c_2)$. The usual restrictions imply that
the $a_i$ will be separately ranked and that $c'c = 1$. However, I imagine
that there will often be more restrictions. To use Jöreskog's words, see e.g.
Jöreskog (1973), every element of t can be fixed, constrained or free.
Some variables, like the frequency variable of example 7, may be so
well defined that their scale vectors are fixed. Another instance of fixed
values is to predetermine c: you have a model about how z should be
defined and investigate whether the best scaling reaches a sufficiently
high variance. If you are not satisfied with the resulting variance then
your model is not good under any monotonic transformations. In case
some or all variables have the same number of categories it may be
desirable to let the scale vectors be identical. This is a reasonable
example of constrained values. Under some combination of fixed,
constrained and free elements of t one is now interested in determining
t such that $\max t'S_{uu}t$ (and perhaps also $\min \max t'S_{uu}t$) is
obtained.

Discriminant analysis is illustrated by finding the best discriminant
function for a one factor design with independently sampled groups.
Let $B_{ij}$ be the cross product matrix between groups for $u_i$ and $u_j$
and $T_{ij}$ be the corresponding matrix for the total group. We further
define $B = (B_{ij})$ and $T = (T_{ij})$. The Q index can be $t'Bt/t'Tt$,
which corresponds to formula 4. The K values of t may be restricted in
different ways, analogous to the case of principal component analysis. To
take a very restricted case, suppose that all x have k + 1 categories and
that we want to find a scale common to all x which gives maximal
discrimination for the unweighted sum of the variables. Then c is fixed and
D is constrained, so that there are only k - 1 ranked values to determine.
Other designs may also be treated.

In canonical correlation analysis we also use a second set of variables.
Let y_i, i = 1, ..., q, have m_i + 1 categories, with scale vector b_i and
v_i as its binary variables, such that y_i = b_i'v_i. Define D_b as a block
diagonal matrix of order M x q, having b_i on the principal diagonal
(M = Σ m_i). We further need a weighted sum d'y = d'D_b'v = t_b'v, where
v' = (v_1', ..., v_q'). Finally, define the covariance matrices
S_vv = (S_{v_i v_j}), S_uv = (S_{u_i v_j}) and S_uu = (S_{u_i u_j}) of
orders M x M, K x M and K x K, respectively. (For instance, the general
element of S_uv is the covariance matrix between u_i and v_j.) If we take
Q as the squared correlation between c'x and d'y it can be written as

    Q = (t_a'S_uv t_b)^2 / (t_a'S_uu t_a · t_b'S_vv t_b)

and thus has the form according to formula 5. Of course, this is also
applicable to multiple correlations, in which case q = 1. As a new example
of restrictions we can mention c = d and D_a = D_b, provided that p = q and
k_i = m_i. This is a reasonable constraint if x and y are the same
variables, measured on two occasions.
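
The squared correlation index can be checked numerically in the simplest
case p = q = 1. The sketch below assumes indicator coding with the lowest
category scored 0 and uses simulated data. With equally spaced scale values
the index should reproduce the ordinary squared correlation of the raw
codes. All names are hypothetical.

```python
import numpy as np

def indicators(x, n_categories):
    """Binary variables for categories 1..k; category 0 is the reference (assumed coding)."""
    x = np.asarray(x)
    return (x[:, None] == np.arange(1, n_categories)[None, :]).astype(float)

def squared_correlation_index(t_a, t_b, u, v):
    """Q = (t_a' S_uv t_b)^2 / (t_a' S_uu t_a * t_b' S_vv t_b)."""
    K = u.shape[1]
    S = np.cov(np.hstack([u, v]), rowvar=False)
    S_uu, S_uv, S_vv = S[:K, :K], S[:K, K:], S[K:, K:]
    return (t_a @ S_uv @ t_b) ** 2 / ((t_a @ S_uu @ t_a) * (t_b @ S_vv @ t_b))

# Hypothetical data: x has 3 categories, y has 4 and depends on x.
rng = np.random.default_rng(1)
x = rng.integers(0, 3, size=60)
y = np.clip(x + rng.integers(-1, 2, size=60), 0, 3)

u, v = indicators(x, 3), indicators(y, 4)
t_a = np.array([1.0, 2.0])        # equally spaced scale for x (c = 1)
t_b = np.array([1.0, 2.0, 3.0])   # equally spaced scale for y (d = 1)

q = squared_correlation_index(t_a, t_b, u, v)
```

The restriction c = d and D_a = D_b mentioned above would correspond to
passing the same vector for t_a and t_b when both variables have the same
number of categories.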

For factor analysis, we are interested in scaling the manifest 
variables so that they fit, as well as possible, to a given factor model. 
Several goodness-of-fit criteria are conceivable, such as the common 
or generalized least squares criterion, a likelihood function or perhaps
the index suggested by Tucker & Lewis (1973). In general, the factor
model is not fully specified, meaning that there are factor parameters
as well as scale values to determine. I imagine that this will imply
an iterative process which 'walks' to and fro between scale values and
parameters: starting with a set of scales, one estimates the parameters 
which constitute the basis for a new set of scale values, and so on. 
If the fit is bad, the model is not compatible with data under any 
admissible transformations of the manifest variables, which is quite 
a general conclusion. 

FINAL REMARKS 

The intention of stability analysis is to get knowledge about how
differently you can describe results due to different scales. We may
imagine two classes of transformations: R(Q), for which the Q index of
result is invariant, and R(C), the admissible class of scale
transformations for a certain concept. The word 'admissible' has the
following (loose) meaning: given a definition of a concept, the possible
outcomes of the instrument chosen (to measure this concept) can be scaled
according to any element in R(C) without fundamental objections as
to a change of the concept. The most common example in educational
research would be the class of linear transformations for R(Q) and
all monotonic transformations for R(C).

Stability analysis is only necessary if R(Q) ⊂ R(C), since otherwise
the result is totally stable. (If R(Q) ⊃ R(C), there are perhaps better,
stable Q indices.) However, when R(Q) ⊂ R(C) it is not unusual to
choose a new Q, such that R(Q) = R(C). Several devices for this
can be found in nonparametric statistics. In my opinion, it is better
to keep the original Q and sharpen the definition of the concept, such
that R(Q) = R(C) or, if this is not possible, to perform a stability
analysis. Suppose we have a Q which we regard as a good description
of data, but with R(Q) ⊂ R(C). I cannot see any reason why we should
lose information by choosing a new Q with R(Q) ⊇ R(C) instead of
performing a stability analysis.

It may be clear from the examples that one is sometimes most
interested in obtaining an extreme value of Q, e.g. a minimal interaction
or a maximal group differentiation. Discussion of such optimal
scaling for more or less special cases is not rare in the research
literature. However, there may be occasions when one wants to report a
typical Q value. This can be defined in several ways but let us take
the expected value. This integral can be difficult to evaluate but the
type 1 simulations discussed earlier give information about the expected
value. It is reasonable to use the arithmetic mean of the generated
distribution.

Of the examples discussed above, such simulations have been performed
for examples 1, 3 and 7 with 200 repetitions. One can, for instance, ask
if Q from the scale with equally spaced values is typical. We answer
by reporting the standardized Q value: example 1 gives 0.11, example
3 gives 0.14 for A, 0.13 for AB and 1.29 for between cells (the value of
B is not defined due to super stability) and example 7 has values between
0.51 and 1.27. The answer is consequently not an unequivocal yes or no.
Moreover, when a measurement is made on different populations and/or
the data are treated with different methods, stability, minimal Q, typical
Q or maximal Q may vary. In conformity with a test having different
reliabilities for different situations, it can also have different scales
for different situations, provided that R(Q) ⊂ R(C).
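
A simulation of this kind might be sketched as follows. The fragment is not
the procedure used for the figures above: it assumes that random monotone
scales are generated by sorting uniform random numbers, that Q is the ratio
of between-groups to total sum of squares in a one factor design, and that
'standardized' means the distance of the equally spaced Q from the mean of
the generated distribution in standard deviation units. The data and names
are hypothetical.

```python
import numpy as np

def eta_squared(z, groups):
    """Between-groups sum of squares divided by total sum of squares."""
    z = np.asarray(z, dtype=float)
    grand = z.mean()
    ss_total = ((z - grand) ** 2).sum()
    ss_between = sum(
        (groups == g).sum() * (z[groups == g].mean() - grand) ** 2
        for g in np.unique(groups)
    )
    return ss_between / ss_total

def random_monotone_scale(n_categories, rng):
    """Increasing scale values from sorted uniform draws (an assumed generator)."""
    return np.sort(rng.uniform(0.0, 1.0, size=n_categories))

# Hypothetical data: one variable with 5 categories, four groups of 4.
rng = np.random.default_rng(42)
x = np.array([0, 0, 1, 1, 2, 1, 2, 2, 3, 2, 3, 3, 4, 3, 4, 4])
groups = np.repeat([0, 1, 2, 3], 4)

# Q for equally spaced scale values.
q_equal = eta_squared(np.arange(5.0)[x], groups)

# Generated distribution of Q over 200 random monotone scales.
q_sim = np.array(
    [eta_squared(random_monotone_scale(5, rng)[x], groups) for _ in range(200)]
)

typical_q = q_sim.mean()
standardized_q = (q_equal - typical_q) / q_sim.std(ddof=1)
```

The minimum and maximum of q_sim also give a rough, inner estimate of the
range of Q, although a proper optimization would be needed to find the
true extremes.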
For complex methods, it may be difficult to construct an effective
algorithm for scanning R(C) in order to find special Q values. An
alternative is to resort to selected transformations and investigate
the variation of Q among these. I have done this for factor analysis,
Larsson (1974), and found the results robust to (some) monotonic
transformations. However, this is not a satisfactory approach and one
must at least try to use more general methods, like the one proposed
in this report.

Binary coding is not the only alternative here. It seems to me that
one can also use polynomials. For an arbitrary, monotonic scoring, w,
the vector (w, w^2, ..., w^k) corresponds to u and the polynomial
coefficients correspond to a. But the formulation of the monotonic
restriction is probably more complicated: instead of only ranking the
elements of a, you now have to rank weighted sums of the coefficients.
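
To make the last remark concrete, the sketch below writes the monotonic
restriction for a polynomial scale as a set of inequalities on weighted
sums of the coefficients. It assumes a polynomial without constant term and
an initial scoring w = 0, 1, ..., k. The function names are hypothetical.

```python
import numpy as np

def polynomial_scale(w, coefficients):
    """Scale values sum_j coefficients[j-1] * w**j for j = 1..degree."""
    w = np.asarray(w, dtype=float)
    powers = np.column_stack([w ** j for j in range(1, len(coefficients) + 1)])
    return powers @ np.asarray(coefficients, dtype=float)

def monotone_constraint_matrix(w, degree):
    """Each row holds the weights on the coefficients for one successive difference."""
    w = np.sort(np.asarray(w, dtype=float))
    powers = np.column_stack([w ** j for j in range(1, degree + 1)])
    return np.diff(powers, axis=0)

def is_monotone(w, coefficients):
    """True if every weighted sum of the coefficients is strictly positive."""
    coefficients = np.asarray(coefficients, dtype=float)
    A = monotone_constraint_matrix(w, len(coefficients))
    return bool(np.all(A @ coefficients > 0))

w = np.arange(5.0)                      # initial scoring 0, 1, 2, 3, 4
ok = is_monotone(w, [1.0, 0.1])         # w + 0.1 w^2 keeps the ranking
bad = is_monotone(w, [1.0, -0.2])       # w - 0.2 w^2 turns downward
```

With binary coding the corresponding constraint matrix only compares
neighbouring elements of a, which is why ranking suffices there.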

When k is large the use of a polynomial may be advantageous. For
instance, a truly continuous variable implies k = n-1, an 'impossible'
number of categories to work with. The problem is to reduce the number
by putting together categories with the lowest possible distortion of
data. For a polynomial, the 'obvious' way is to reduce its degree but I do
not know how to handle the binary variables.

Provided that the optimization routines turn out to be dependable,
it is my intention to investigate the stability of some univariate statistical
methods on various data sets. It may be interesting to know whether
stability varies with e.g. different educational research areas, different
statistical methods and different numbers of categories. The investigation
will give access to programs designed to determine minimal and maximal
Q (and perhaps typical Q) for some statistical methods. It seems to me
to be more sensible to report minimal and maximal Q, perhaps along
with Q for equally spaced scale values, than only the latter. Suppose
that the latter Q is 0.25 in two different cases (possible range of Q:
0 < Q < 1). Suppose further that 0.00 < Q < 0.75 in the first case and
0.20 < Q < 0.30 in the second case. I do not think that the stability
information will cause one to judge the cases identically, although Q for
equally spaced scale values is the same in both cases.
APPENDIX

Figure 1. The Q index of example 1. 

Figure 2. The Q indices of example 2.

Figure 3. The Q indices of example 3. 

Figure 4. The Q indices of example 4. 

Figure 5. Alternative Q index for factor A of example 3.

Figure 6. The Q index of example 5. 

Figure 7. The Q index of example 6. 